Creating a depth map



The invention relates to a method of generating a depth map comprising depth 
values representing distances to a viewer, for respective pixels of an image. 

The invention further relates to a depth map generating unit for generating a
depth map comprising depth values representing distances to a viewer, for respective pixels
of an image.

The invention further relates to an image processing apparatus comprising:
- receiving means for receiving a signal corresponding to an image; and
- such a depth map generating unit for generating a depth map.

The invention further relates to a computer program product to be loaded by a
computer arrangement, comprising instructions to generate a depth map comprising depth
values representing distances to a viewer, for respective pixels of an image, the computer
arrangement comprising processing means and a memory.

In order to generate a 3D impression on a multi-view display device, images
from different virtual viewpoints have to be rendered. This requires either multiple input
views or some 3D or depth information to be present. This depth information can be
recorded, generated from multi-view camera systems or generated from conventional 2D
video material. For generating depth information from 2D video, several types of depth cues
can be applied, such as structure from motion, focus information, geometric shapes and
dynamic occlusion. The aim is to generate a dense depth map, i.e. a depth value per pixel.
This depth map is subsequently used in rendering a multi-view image to give the viewer a
depth impression. In the article "Synthesis of multi viewpoint images at non-intermediate
positions" by P.A. Redert, E.A. Hendriks, and J. Biemond, in Proceedings of International
Conference on Acoustics, Speech, and Signal Processing, Vol. IV, ISBN 0-8186-7919-0,
pages 2749-2752, IEEE Computer Society, Los Alamitos, California, 1997, a method of
extracting depth information and of rendering a multi-view image on basis of the input image
and the depth map is disclosed.

It is an object of the invention to provide a method of the kind described in the
opening paragraph, which is based on a new depth cue.

This object of the invention is achieved in that the method comprises:
- computing a cost value for a first one of the pixels of the image by combining
differences between values of pixels which are disposed on a path from the first one of the
pixels to a second one of the pixels which belongs to a predetermined subset of the pixels of
the image; and
- assigning a first one of the depth values corresponding to the first one of the
pixels on basis of the cost value.

The invention is based on the following observation. Objects in a scene to be
imaged have different sizes, luminances, and colors and have a certain spatial disposition.
Some of the objects occlude other objects in the image. Differences between luminance
and/or color values of pixels in an image are primarily related to the differences between
optical characteristics of the surfaces of the objects and related to the spatial positions of
objects relative to light sources within the scene. Optical characteristics of surfaces comprise
e.g. color and reflectiveness. Hence, a relatively large transition in luminance and/or color,
i.e. a relatively big difference between pixel values of neighboring pixels, corresponds to a
transition between a first image segment and a second image segment, whereby the first
image segment corresponds to a first object and the second image segment corresponds to a
second object in the scene being imaged. By determining for the pixels of the image the
number and extent of transitions in luminance and/or color, i.e. differences between pixel
values on a path from the respective pixels to a predetermined location of the image,
respective measures related to the spatial disposition of the objects in the scene can be
obtained. These measures, i.e. cost values, are subsequently translated into depth values. This
translation is preferably a multiplication of the cost value with a predetermined constant.
Alternatively, this translation corresponds to a mapping of the respective cost values to a
predetermined range of depth values by means of normalization.
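
The two translations mentioned above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the function names, the constant `alpha` and the target range `[d_min, d_max]` are assumptions chosen for the example, and the cost values are assumed to be given as a list of rows.

```python
def depth_by_scaling(costs, alpha):
    """Translate cost values into depth values by multiplying with a constant.

    `alpha` is a hypothetical name for the predetermined constant.
    """
    return [[alpha * v for v in row] for row in costs]


def depth_by_normalization(costs, d_min=0.0, d_max=255.0):
    """Map cost values linearly onto the range [d_min, d_max].

    The range limits are assumed values; a flat cost map is mapped to d_min.
    """
    flat = [v for row in costs for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[d_min for _ in row] for row in costs]
    scale = (d_max - d_min) / (hi - lo)
    return [[d_min + (v - lo) * scale for v in row] for row in costs]
```

With normalization the smallest cost value in the image is mapped to `d_min` and the largest to `d_max`, so the resulting depth range does not depend on the image content.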

It should be noted that the background also forms one or more objects, e.g. the
sky or a forest or a meadow.

The depth value which is based on the luminance and/or color transients can 
be directly used as depth value for rendering a multi-view image, e.g. as described in the 
cited article. Preferably, the depth value according to the invention is combined with other 
depth values, which are based on alternative depth cues as mentioned above. 

In an embodiment of the method according to the invention, a first one of the
differences is equal to a difference between respective values of neighboring pixels, which
are disposed on the path. Computing a difference between two adjacent pixels is relatively
easy. Alternatively, a difference is based on more than two pixel values. In a further
alternative, a difference is computed between two pixels which are both on the path but
which are not adjacent, e.g. the difference is computed between a minimum and a maximum
pixel value, whereby the minimum and maximum pixel values correspond to respective pixels
which are located within a predetermined distance. Preferably, an absolute difference
between respective values of pixels, which are disposed on the path, is computed.

In an embodiment of the method according to the invention, the cost value for
the first one of the pixels is computed by accumulating the differences between the values of
the pixels, which are disposed on the path. Accumulation, i.e. integration, summation or
addition of differences is relatively easy to implement. Preferably, only the differences which
are larger than a predetermined threshold are combined by means of accumulation. An
advantage of applying a threshold is that the depth value determination is less sensitive to
noise within the image.
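
The accumulation with a threshold can be sketched as below. The sketch assumes that the image is a list of rows of scalar pixel values, that the path is a list of `(x, y)` coordinates running from the first pixel to the second pixel, and that `path_cost` and `threshold` are illustrative names only.

```python
def path_cost(image, path, threshold=0):
    """Accumulate absolute differences between consecutive pixels on a path.

    Only differences larger than `threshold` contribute, which makes the
    cost less sensitive to small, noise-like variations.
    """
    cost = 0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        diff = abs(image[y1][x1] - image[y0][x0])
        if diff > threshold:
            cost += diff
    return cost
```

For a single row with values 10, 12, 50, 52 and a path from the rightmost to the leftmost pixel, a threshold of 5 keeps only the one large transition between 12 and 50.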

In another embodiment of the method according to the invention, the cost
value for the first one of the pixels is computed by accumulating products of differences between
the values of the pixels, which are disposed on the path, and respective weighting factors for
the differences. By applying weighting factors, it is possible to control the contributions of
pixel value differences to the computation of depth values corresponding to the respective
pixels. For example, a first one of the weighting factors, which is related to a difference
between a value of a particular pixel and a value of its neighboring pixel, is based on a
distance between the particular pixel and the first one of the pixels. The first one of the
weighting factors is typically relatively low if the distance between the particular pixel and
the first one of the pixels is relatively high. For example, a second one of the weighting
factors, which is related to a difference between a value of a particular pixel and a value of its
neighboring pixel, is based on the location of the neighboring pixel relative to the particular
pixel. E.g. the second one of the weighting factors is relatively high if the neighboring pixel
is located above the particular pixel and is relatively low if the neighboring pixel is located
below the particular pixel. Alternatively, the second one of the weighting factors is related to
the angle between a first vector and a second vector, whereby the first vector corresponds to
the location of the neighboring pixel relative to the particular pixel and the second vector
corresponds to the location of the first one of the pixels relative to the second one of the pixels.

An embodiment of the method according to the invention further comprises:
- computing a second cost value for the first one of the pixels of the image by
combining differences between values of pixels which are disposed on a second path from
the first one of the pixels to a third one of the pixels which belongs to the predetermined
subset of the pixels of the image;
- determining the minimum of the cost value and the second cost value; and
- assigning the first one of the depth values corresponding to the first one of the
pixels on basis of the minimum.

In this embodiment according to the invention, the first one of the depth values is
based on a particular selection of multiple values related to multiple paths, i.e. the optimum
path from the first one of the pixels to the second one of the pixels. Notice that the second
one of the pixels and the third one of the pixels may be mutually equal, i.e. the same. An
alternative type of selection or combination of the cost values related to respective paths is
advantageous. For instance an average of cost values related to multiple paths can be
computed.

Another embodiment of the method according to the invention further
comprises computing a second cost value for a third one of the pixels on basis of the cost
value for the first one of the pixels. Reusing already computed cost values results in a
computationally efficient implementation. Typically, computing the second cost value is
performed by combining the cost value of the first one of the pixels with a difference
between further values of further pixels which are disposed on a second path from the third
one of the pixels to the first one of the pixels.

In an embodiment of the method according to the invention, whereby cost
values corresponding to respective pixels of the image are successively computed on basis of
further cost values being computed for further pixels, a first scan direction of successive
computations of cost values for a first row of pixels of the image is opposite to a second scan
direction of successive computations of cost values for a second row of pixels of the image.
Typically, a depth value has to be computed for each of the pixels of the image. Preferably,
use is made of cost values already computed for other pixels when computing a particular
cost value for a particular pixel. The order in which the successive pixels are processed, i.e.
the depth values are computed, is relevant. Preferably, the order is such that the pixels of the
image are processed row-by-row or alternatively column-by-column. If the pixels are
processed row-by-row then it is advantageous to process successive rows in alternating
directions, e.g. the even rows from left to right and the odd rows from right to left or vice versa.



WO 2005/083631    PCT/IB2005/050482



The inventors have observed that this zigzag type of processing yields much better results
than processing in which all rows are processed in the same scan direction. The quality of
the depth map created on basis of this zigzag type of processing is comparable with results
from more expensive methods of determining cost values for respective paths. Here, more
expensive means that more paths are evaluated in order to determine the optimal path.

It is a further object of the invention to provide a depth map generating unit of
the kind described in the opening paragraph, which is based on a new depth cue.

This object of the invention is achieved in that the generating unit comprises:
- computing means for computing a cost value for a first one of the pixels of the
image by combining differences between values of pixels which are disposed on a path from
the first one of the pixels to a second one of the pixels which belongs to a predetermined
subset of the pixels of the image; and
- assigning means for assigning a first one of the depth values corresponding to
the first one of the pixels on basis of the cost value.

It is a further object of the invention to provide an image processing apparatus
comprising a depth map generating unit of the kind described in the opening paragraph,
which is arranged to generate a depth map based on a new depth cue.

This object of the invention is achieved in that the generating unit comprises:
- computing means for computing a cost value for a first one of the pixels of the
image by combining differences between values of pixels which are disposed on a path from
the first one of the pixels to a second one of the pixels which belongs to a predetermined
subset of the pixels of the image; and
- assigning means for assigning a first one of the depth values corresponding to
the first one of the pixels on basis of the cost value.

It is a further object of the invention to provide a computer program product of
the kind described in the opening paragraph, which is based on a new depth cue.

This object of the invention is achieved in that the computer program product,
after being loaded, provides said processing means with the capability to carry out:
- computing a cost value for a first one of the pixels of the image by combining
differences between values of pixels which are disposed on a path from the first one of the
pixels to a second one of the pixels which belongs to a predetermined subset of the pixels of
the image; and
- assigning a first one of the depth values corresponding to the first one of the
pixels on basis of the cost value.

Modifications of the depth map generating unit and variations thereof may
correspond to modifications and variations thereof of the image processing apparatus, the
method and the computer program product, being described.



These and other aspects of the depth map generating unit, of the image
processing apparatus, of the method and of the computer program product, according to the
invention will become apparent from and will be elucidated with respect to the
implementations and embodiments described hereinafter and with reference to the
accompanying drawings, wherein:

Fig. 1 schematically shows an image and the corresponding depth map being
generated with the method according to the invention;

Fig. 2 schematically shows two paths;

Fig. 3 schematically shows a path from a first pixel to a second pixel, a first
vector from the first pixel to the second pixel and a second vector from a third pixel to a
fourth pixel on the path;

Fig. 4 schematically shows a multi-view image generation unit comprising a
depth map generation unit according to the invention;

Fig. 5 schematically shows an embodiment of the image processing apparatus
according to the invention; and

Fig. 6 schematically shows the sequence of processing respective pixels of the
image.

Same reference numerals are used to denote similar parts throughout the
Figures.

Fig. 1 schematically shows an image 100 and the corresponding depth map
122 being generated with the method according to the invention. Fig. 1 shows an image 100
representing a first object 106, a second object 104 which is located behind the first object
106 and a third object 102 which is located behind the second object 104. Fig. 1 further
shows a path 112 from a first pixel 108 to a second pixel 110. The path 112 corresponds to a
group of connected pixels.

In general, a pixel in an image is connected to 8 neighboring pixels, i.e. 2
pixels being horizontally located relative to the pixel, 2 pixels being vertically located
relative to the pixel and 4 pixels being diagonally located relative to the pixel. Pairs of pixels
of the path 112 are mutually located in one of these 3 ways, i.e. horizontally, vertically or
diagonally relative to each other.
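
As an illustration of this connectivity requirement, the following sketch checks whether a list of pixel coordinates forms a connected path. The name `is_connected_path` and the representation of a path as a list of `(x, y)` tuples are assumptions made for the example.

```python
def is_connected_path(path):
    """Return True if every pair of consecutive pixels is 8-connected.

    Two pixels are 8-connected if they differ by at most one position
    horizontally and vertically and are not the same pixel.
    """
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        if max(abs(x1 - x0), abs(y1 - y0)) != 1:
            return False
    return True
```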

In Fig. 1 two pairs of pixels, comprising the pixels with respective reference
numbers 114, 116 and 118, 120 are depicted. The first pair of pixels 114, 116 is disposed on
the path 112 at a location corresponding to the transition between the first object 106 and the
second object 104. The second pair of pixels 118, 120 is disposed on the path 112 at another
location corresponding to the transition between the second object 104 and the third object
102.

Fig. 1 also shows a depth map 122. The depth map 122 comprises a first group
128 of depth values corresponding to the first object 106, comprises a second group 126 of
depth values corresponding to the second object 104 and comprises a third group 124 of
depth values corresponding to the third object 102. The depth values of the first group 128 of
depth values are lower than the depth values of the second group 126 of depth values,
meaning that the first object 106 is closer to a viewer of the image 100, or of a multi-view
image which is based on the image 100, than the second object 104.

The depth map 122 is generated on basis of the method according to the
invention. For the generation of the depth value 130 corresponding to the first pixel 108 the
following steps are performed:
- a cost value for the first pixel 108 of the image is computed by combining
differences between values of pairs of connected pixels 114, 116 and 118, 120, which are
disposed on a path 112 from the first pixel 108 to the second pixel 110 which belongs to a
predetermined subset of the pixels of the image 100; and
- the depth value 130 corresponding to the first pixel 108 is computed by
dividing a constant value by the cost value for the first pixel 108.

The second pixel 110 belongs to a predetermined subset of the pixels of the
image 100. In this case the predetermined subset comprises pixels at the border of the image.
In alternative embodiments the subset comprises pixels of a part of the border, e.g. only the
pixels of the upper border of the image or the lower border of the image. In a further
alternative the subset comprises a central pixel of the image.

As explained above, the assigned depth value for the first pixel 108 is related
to a cost function for the first pixel 108. The cost function is based on transitions, i.e. the cost
value increases when there are more and/or bigger transitions on the path from the first
pixel 108 to the second pixel 110. The assigned depth value can be based on one of the following
approaches:
- the first pixel is assigned a relatively low depth value if the corresponding cost
value is relatively low; or
- the first pixel is assigned a relatively high depth value if the corresponding
cost value is relatively high.

To summarize, there is a relation between the cost value and the location of
the second pixel 110 and a relation between the assigned depth value and the cost value.
Table 1 shows a number of possible relations between these quantities. In the cases as listed
in Table 1, it is assumed that the first pixel 108 is located at the center of the image.



Table 1: Relations between location, cost value and depth value.

Location of the second pixel | Cost value of the first pixel | Depth value of the first pixel
-----------------------------+-------------------------------+-------------------------------
Upper border                 | High                          | High
Upper border                 | High                          | Low
Lower border                 | High                          | High
Lower border                 | High                          | Low
Left/right border            | High                          | High
Left/right border            | High                          | Low
Center                       | Low                           | High
Center                       | Low                           | Low



A relatively low depth value means that the first pixel is relatively close to the
viewer of the multi-view image being generated on basis of the image and a relatively high
depth value means that the first pixel is relatively far removed from the viewer of the
multi-view image.

Preferably, the computation of the cost value $V(x', y')$ is based on an
accumulation of pixel value differences, which are allocated to pixels being located on a path
$P_i$ from the first pixel to the second pixel, with $i$ being an index to indicate a particular one of
the paths from the pixel with coordinates $(x', y')$:

$$V(x', y') = \sum \{ E(x, y) \mid (x, y) \in P_i \} \qquad (1)$$

A first example of the computation of a pixel value difference $E(x, y)$ is given in Equation 2:

$$E(x, y) = | I(x, y) - I(x - a, y - b) | \qquad (2)$$

with $I(x, y)$ the luminance value of a pixel with coordinates $x$ and $y$ of the image and
$-1 \le a \le 1$ and $-1 \le b \le 1$.

Alternatively, a pixel value difference $E(x, y)$ is computed on basis of color
values:

$$E(x, y) = | C(x, y) - C(x - a, y - b) | \qquad (3)$$

with $C(x, y)$ a color value of a pixel with coordinates $x$ and $y$ of the image. In Equation 4 a
further alternative is given for the computation of a pixel value difference $E(x, y)$ based on
the three different color components R (Red), G (Green) and B (Blue):

$$E(x, y) = \max( | R(x, y) - R(x - a, y - b) |, | G(x, y) - G(x - a, y - b) |, | B(x, y) - B(x - a, y - b) | ) \qquad (4)$$
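
Equations 2 and 4 can be sketched in code as follows. The function names and the data layout are assumptions for this example: `lum` is a list of rows of luminance values and `rgb` is a list of rows of `(R, G, B)` tuples, both indexed as `[y][x]`. The offsets `a` and `b` select one of the 8 neighboring pixels.

```python
def _check_neighbor(grid, x, y, a, b):
    """Validate the offsets and make sure the neighbor lies inside the image."""
    if a not in (-1, 0, 1) or b not in (-1, 0, 1) or (a == 0 and b == 0):
        raise ValueError("a and b must select one of the 8 neighbors")
    if not (0 <= y - b < len(grid) and 0 <= x - a < len(grid[0])):
        raise IndexError("neighbor lies outside the image")


def luminance_difference(lum, x, y, a, b):
    """Equation 2: absolute luminance difference to the neighbor (x-a, y-b)."""
    _check_neighbor(lum, x, y, a, b)
    return abs(lum[y][x] - lum[y - b][x - a])


def max_rgb_difference(rgb, x, y, a, b):
    """Equation 4: largest absolute difference over the R, G and B components."""
    _check_neighbor(rgb, x, y, a, b)
    p, q = rgb[y][x], rgb[y - b][x - a]
    return max(abs(pc - qc) for pc, qc in zip(p, q))
```

Taking the maximum over the color components means that a transition which is visible in only one component still contributes fully to the cost value.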

Optionally, the pixel value difference signal $E$ is filtered by clipping all pixel
value differences which are below a predetermined threshold to a constant, e.g. zero.

As said, preferably the computation of the cost value $V(x', y')$ is based on an
accumulation of pixel value differences being allocated to pixels being located on a path
$P_i$ from the first pixel to the second pixel. There are several approaches to select this path
from a set of paths.

Fig. 1 schematically shows a path which is based on a simple strategy, i.e. the
shortest distance from the first pixel 108 to a pixel of the predetermined set being the pixels
which are located at the left border of the image. That means that the first pixel 108 and the
second pixel 110 have the same y-coordinate.
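
For this simple strategy the cost values of a whole row can be obtained with one running sum per row, because the path of each pixel extends the path of its left neighbor by one pixel pair. The sketch below assumes an image given as a list of rows of scalar pixel values; the function name is illustrative.

```python
def row_costs_to_left_border(image):
    """Cost per pixel for a straight horizontal path to the left border.

    Each cost is the sum of absolute differences between horizontally
    adjacent pixels from the left border up to the pixel itself.
    """
    costs = []
    for row in image:
        running = 0
        row_cost = []
        for x, value in enumerate(row):
            if x > 0:
                running += abs(value - row[x - 1])
            row_cost.append(running)
        costs.append(row_cost)
    return costs
```

Pixels at the left border get a cost value of zero, and the cost value can only stay equal or increase towards the right.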

Fig. 2 schematically shows alternative paths 216, 202. The first one 216 of
these alternative paths corresponds to the shortest distance from the first pixel 108 to a
second pixel 214 of the predetermined set being the pixels which are located at the left border
of the image. In this case the restriction that the first pixel 108 and the second pixel 214 have
mutually equal y-coordinates is not applicable. The second pixel 214 corresponds to the pixel
of the left border pixels which can be reached with minimum costs. On the first one 216 of
these alternative paths the pixels with the following reference numbers 218-224 are disposed.

Fig. 2 also shows a second one 202 of alternative paths, which corresponds to
the path reaching the upper border and having minimum costs. It can be clearly seen that the
second one 202 of alternative paths has an irregular shape. On the second one 202 of
alternative paths the pixels with the following reference numbers 206-212 are disposed.
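
The text does not prescribe how a minimum-cost path is found. One standard way, used here purely as an illustration, is Dijkstra's algorithm started simultaneously from all pixels of the predetermined subset; it then yields, for every pixel, the minimum cost over all 8-connected paths to that subset. The function name and the representation of the subset as a list of `(x, y)` tuples are assumptions.

```python
import heapq


def min_costs_to_subset(image, subset):
    """Minimum accumulated absolute difference from each pixel to `subset`.

    Dijkstra's algorithm with the subset pixels as zero-cost sources and
    the absolute pixel value difference as the cost of a step between
    8-connected pixels.
    """
    height, width = len(image), len(image[0])
    cost = [[float("inf")] * width for _ in range(height)]
    heap = []
    for x, y in subset:
        cost[y][x] = 0
        heap.append((0, x, y))
    heapq.heapify(heap)
    while heap:
        c, x, y = heapq.heappop(heap)
        if c > cost[y][x]:
            continue  # outdated heap entry
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx or dy) and 0 <= nx < width and 0 <= ny < height:
                    nc = c + abs(image[ny][nx] - image[y][x])
                    if nc < cost[ny][nx]:
                        cost[ny][nx] = nc
                        heapq.heappush(heap, (nc, nx, ny))
    return cost
```

In a 3x3 image with a single bright center pixel and the left border as subset, the pixel to the right of the center still gets cost zero, because a path can go around the bright pixel. This is the kind of irregularly shaped path referred to above.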

Instead of computing the cost value $V(x', y')$ on basis of a single path, the cost
value can be based on a combination of paths, e.g. the average of the cost values may be computed.

A further alternative for computing the cost value $V(x', y')$ is based on
weighting factors for the various pixel value differences:

$$V(x', y') = \sum \{ W(j) \, E(x, y) \mid (x, y) \in P_i \} \qquad (5)$$

This weighting factor $W(j)$ is preferably related to a spatial distance $j$
between one of the pixels of the pixel pair for which a pixel value difference $E(x, y)$ is being
computed and the first pixel. Typically, this weighting factor $W(j)$ is lower for bigger spatial
distances.
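
A distance-based weighting in the sense of Equation 5 might look as follows. The text only states that the weighting factor decreases with distance; the specific choice $W(j) = 1 / (1 + j)$ and all names in the sketch are assumptions for illustration.

```python
import math


def inverse_distance_weight(j):
    """Hypothetical weighting factor that decreases with the distance j."""
    return 1.0 / (1.0 + j)


def weighted_path_cost(image, path, weight=inverse_distance_weight):
    """Accumulate W(j) * E(x, y) along a path that starts at the first pixel.

    j is the Euclidean distance between the first pixel and the pixel of
    each pair that comes first on the path.
    """
    x_first, y_first = path[0]
    cost = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        diff = abs(image[y1][x1] - image[y0][x0])
        j = math.hypot(x0 - x_first, y0 - y_first)
        cost += weight(j) * diff
    return cost
```

With this choice, a transition directly next to the first pixel counts fully, while a transition two pixels away counts for one third.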

Alternatively, the weighting factor $W(j)$ is related to an angle between two
vectors. Fig. 3 schematically shows a path from a first pixel 108 to a second pixel 110, a first
vector 308 from the first pixel 108 to the second pixel 110 and a second vector 306 from a
third pixel 302 to a fourth pixel 304 on the path 300. Typically, the weighting factor $W(j)$ is
relatively high if the angle between the first vector 308 and the second vector 306 is
relatively low, i.e. the orientation of the fourth pixel 304 relative to the third pixel 302
matches the orientation of the second pixel 110 relative to the first pixel 108. That
means that transitions which match the first vector 308 are considered to be more
relevant than transitions which are e.g. perpendicular to the first vector 308.
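
An angle-based weighting can be sketched with the cosine of the angle between the two vectors. The text does not specify the weighting function; clipping the cosine at zero, so that perpendicular or opposing steps do not contribute, is an assumption of this example, as are the function names.

```python
import math


def angle_weight(first_vector, step_vector):
    """Hypothetical weight: cosine of the angle between the vectors, clipped at 0."""
    fx, fy = first_vector
    sx, sy = step_vector
    norm = math.hypot(fx, fy) * math.hypot(sx, sy)
    if norm == 0:
        return 0.0
    return max(0.0, (fx * sx + fy * sy) / norm)


def angle_weighted_path_cost(image, path):
    """Accumulate differences weighted by how well each step matches the
    vector from the first pixel (path[0]) to the second pixel (path[-1])."""
    (x_a, y_a), (x_b, y_b) = path[0], path[-1]
    first_vector = (x_b - x_a, y_b - y_a)
    cost = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        diff = abs(image[y1][x1] - image[y0][x0])
        cost += angle_weight(first_vector, (x1 - x0, y1 - y0)) * diff
    return cost
```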

Fig. 4 schematically shows a multi-view image generation unit 400 comprising
a depth map generation unit 401 according to the invention. The multi-view image generation
unit 400 is arranged to generate a sequence of multi-view images on basis of a sequence of
video images. The multi-view image generation unit 400 is provided with a stream of video
images at the input connector 408 and provides two correlated streams of video images at the
output connectors 410 and 412, respectively. These two correlated streams of video images
are to be provided to a multi-view display device which is arranged to visualize a first series
of views on basis of the first one of the correlated streams of video images and to visualize a
second series of views on basis of the second one of the correlated streams of video images.
If a user, i.e. viewer, observes the first series of views with his left eye and the second series of
views with his right eye, he notices a 3D impression. It might be that the first one of the
correlated streams of video images corresponds to the sequence of video images as received
and that the second one of the correlated streams of video images is rendered on basis of the
sequence of video images as received. Preferably, both streams of video images are rendered
on basis of the sequence of video images as received. The rendering is e.g. as
described in the article "Synthesis of multi viewpoint images at non-intermediate positions"
by P.A. Redert, E.A. Hendriks, and J. Biemond, in Proceedings of International Conference
on Acoustics, Speech, and Signal Processing, Vol. IV, ISBN 0-8186-7919-0, pages 2749-2752,
IEEE Computer Society, Los Alamitos, California, 1997. Alternatively, the rendering
is as described in "High-quality images from 2.5D video", by R.P. Berretty and F.E. Ernst, in
Proceedings Eurographics, Granada, 2003, Short Note 124.

The multi-view image generation unit 400 comprises:
- a depth map generation unit 401 for generating depth maps for the respective
input images on basis of the transitions in the image; and
- a rendering unit 406 for rendering the multi-view images on basis of the input
images and the respective depth maps, which are provided by the depth map generation unit
401.

The depth map generating unit 401 for generating depth maps comprising
depth values representing distances to a viewer, for respective pixels of the images,
comprises:
- a cost value computing unit 402 for computing a cost value for a first one of
the pixels of the image by combining differences between values of pixels which are
disposed on a path from the first one of the pixels to a second one of the pixels which belongs
to a predetermined subset of the pixels of the image. The computation of cost values is as
described in connection with any of the Figs. 1, 2 and 3; and
- a depth value assigning unit 404 for assigning a first one of the depth values
corresponding to the first one of the pixels on basis of the cost value.

The computing unit 402 is arranged to provide a cost value signal
$V(x', y', n)$, with coordinates $x'$ and $y'$ of the image at time $n$, which represents per pixel
the cost value.

After the computation of the cost value signal, the depth map is determined.
This is specified in Equation 6:

$$D(x', y', n) = F(V(x', y', n)) \qquad (6)$$

with $D(x', y', n)$ the depth value of a pixel with coordinates $x'$ and $y'$ of the image at time $n$ and
the function $F$ being a linear or non-linear transformation of a cost value $V(x', y', n)$ into
a depth value $D(x', y', n)$. This function $F$ is preferably a simple multiplication of the
cost value $V(x', y', n)$ with a predetermined constant $\alpha$:

$$D(x', y', n) = \alpha V(x', y', n) \qquad (7)$$



It should be noted that for the computation of the cost value for a particular
pixel the computed cost value for a neighboring pixel could be applied. In other words, the
computation of cost values is preferably performed in a recursive way. See also the
description in connection with Fig. 6.

The cost value computing unit 402, the depth value assigning unit 404 and the
rendering unit 406 may be implemented using one processor. Normally, these functions are
performed under control of a software program product. During execution, normally the
software program product is loaded into a memory, like a RAM, and executed from there.
The program may be loaded from a background memory, like a ROM, hard disk, or
magnetic and/or optical storage, or may be loaded via a network like the Internet. Optionally
an application specific integrated circuit provides the disclosed functionality.

It should be noted that, although the multi-view image generation unit 400 as
described in connection with Fig. 4 is designed to deal with video images, alternative
embodiments of the depth map generation unit according to the invention are arranged to
generate depth maps on basis of individual images, i.e. still pictures.

Fig. 5 schematically shows an embodiment of the image processing apparatus
500 according to the invention, comprising:
- a receiving unit 502 for receiving a video signal representing input images;
- a multi-view image generation unit 401 for generating multi-view images on
basis of the received input images, as described in connection with Fig. 4; and
- a multi-view display device 506 for displaying the multi-view images as
provided by the multi-view image generation unit 401.

The video signal may be a broadcast signal received via an antenna or cable
but may also be a signal from a storage device like a VCR (Video Cassette Recorder) or
Digital Versatile Disk (DVD). The signal is provided at the input connector 510. The image
processing apparatus 500 might e.g. be a TV. Alternatively the image processing apparatus
500 does not comprise the optional display device but provides the output images to an
apparatus that does comprise a display device 506. Then the image processing apparatus 500
might be e.g. a set top box, a satellite-tuner, a VCR player, a DVD player or recorder.
Optionally the image processing apparatus 500 comprises storage means, like a hard-disk or
means for storage on removable media, e.g. optical disks. The image processing apparatus
500 might also be a system being applied by a film-studio or broadcaster.

Fig. 6 schematically shows the sequence of processing respective pixels 600-614
of the image 100. The arrows 616-620 in Fig. 6 indicate the order in which the
pixels are processed. The first row of the image 100, comprising the pixels with
reference numbers 610-614, is processed from the left to the right. The second row of the image
100, comprising the pixels with reference numbers 604-608, is processed from the right to the
left. The third row of the image 100, comprising the pixels with reference numbers 600-602,
is processed from the left to the right again. Hence successive rows of pixels of the image
100 are processed in opposite directions.

Processing a particular pixel means:
- computing a particular cost value for the particular pixel by combining
differences between values of pixels which are disposed on a path from the particular pixel to
a second pixel which belongs to a predetermined subset of the pixels of the image; and
- assigning a particular depth value to the depth map under construction,
corresponding to the particular pixel on basis of the computed particular cost value.

Computing the particular cost value is based on already computed cost values
for other pixels. The following example is provided to illustrate that. Suppose that the depth
values corresponding to pixels 604-614 of the first and second row have already been
determined, and hence the respective cost values corresponding to respective paths are
known. Besides that, a number of pixels 602 of the third row have also been processed. Next
the depth value for a particular pixel with reference number 600 has to be determined.
Preferably, this is done by evaluating the following set of candidate cost values:
- a first candidate cost value which is based on the cost value of the pixel 602
being located left from the particular pixel 600 and a pixel value difference between these
two pixels;
- a second candidate cost value which is based on the cost value of the pixel 606
being located above the particular pixel 600 and a pixel value difference between these two
pixels;
- a third candidate cost value which is based on the cost value of the pixel 604
being located left above the particular pixel 600 and a pixel value difference between these
two pixels; and
- a fourth candidate cost value which is based on the cost value of the pixel 608
being located right above the particular pixel 600 and a pixel value difference between these
two pixels.

After determining the minimum cost value from the set of candidate cost
values the path starting from the particular pixel is known, the corresponding cost value is
known and the corresponding depth value can be assigned.

It will be clear that sets of candidate cost values typically depend on the scan
direction. For instance in the case of a scan direction from the right to the left, the sets of
candidate cost values may comprise a candidate cost value which is based on the cost value
of a pixel being located right from the particular pixel under consideration. The sets of
candidate cost values may comprise additional cost values. Alternatively, the sets of
candidate cost values comprise fewer cost values.
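
The recursive computation with alternating scan directions described in connection with Fig. 6 can be sketched as follows. Several points are assumptions of this example and not taken from the text: the predetermined subset is taken to be the upper border, so the first row gets cost zero; each candidate is formed by adding the absolute pixel value difference to the cost value of the neighbor; and the function name is illustrative.

```python
def zigzag_costs(image):
    """Recursively compute cost values row by row with alternating scan direction.

    Candidates for a pixel are the already processed neighbor in the same
    row (left or right, depending on the scan direction) and the three
    neighbors in the row above. The minimum candidate is kept.
    """
    height, width = len(image), len(image[0])
    cost = [[0] * width for _ in range(height)]  # first row: upper border, cost 0
    for y in range(1, height):
        left_to_right = (y % 2 == 0)
        xs = range(width) if left_to_right else range(width - 1, -1, -1)
        prev_dx = -1 if left_to_right else 1
        for x in xs:
            candidates = []
            px = x + prev_dx  # neighbor processed just before in this row
            if 0 <= px < width:
                candidates.append(cost[y][px] + abs(image[y][x] - image[y][px]))
            for dx in (-1, 0, 1):  # left above, above, right above
                ax = x + dx
                if 0 <= ax < width:
                    candidates.append(cost[y - 1][ax] + abs(image[y][x] - image[y - 1][ax]))
            cost[y][x] = min(candidates)
    return cost
```

Each pixel is visited once and evaluates at most four candidates, so the effort grows linearly with the number of pixels, in contrast to a search that evaluates many complete paths per pixel.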

It should be noted that the above-mentioned embodiments illustrate rather than
limit the invention and that those skilled in the art will be able to design alternative
embodiments without departing from the scope of the appended claims. In the claims, any
reference signs placed between parentheses shall not be construed as limiting the claim.
The word "comprising" does not exclude the presence of elements or steps not listed in a
claim. The word "a" or "an" preceding an element does not exclude the presence of a
plurality of such elements. The invention can be implemented by means of hardware
comprising several distinct elements and by means of a suitably programmed computer. In
the unit claims enumerating several means, several of these means can be embodied by one
and the same item of hardware. The usage of the words first, second and third, etcetera does not
indicate any ordering. These words are to be interpreted as names.



